- DPLACE(1) DPLACE(1)
-
-
-
- NAME
- dplace - a NUMA memory placement tool
-
- SYNOPSIS
- dplace [-place placement_file]
- [-data_pagesize n-bytes]
- [-data_lpage_wait [off|on]]
- [-stack_pagesize n-bytes]
- [-stack_lpage_wait [off|on]]
- [-text_pagesize n-bytes]
- [-text_lpage_wait [off|on]]
- [-migration [off|on|threshold]]
- [-migration_level threshold]
- [-propagate]
- [-mustrun]
- [-v[erbose]]
- program [program-arguments]
-
-
- DESCRIPTION
- The given program is executed after placement policies are set up
- according to command line arguments and the specifications described in
- placement_file.
-
-
- OPTIONS
- -place placement_file
- Placement information is read from placement_file. If this argument
- is omitted, no input file is read. See dplace(5) for the correct
- placement file format.
-
- -data_pagesize n-bytes
- Data and heap pages will be of size n-bytes. A valid page size is
- 16k multiplied by a non-negative integer power of 4, up to a
- maximum of 16m. The valid page sizes are therefore 16k, 64k, 256k,
- 1m, 4m, and 16m.
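
The page size rule can be checked with a few lines of POSIX shell. This is only an illustrative sketch: the function name valid_pagesizes is hypothetical and is not part of dplace.

```shell
# Hypothetical helper: list every page size of the form 16k * 4^n,
# stopping at the documented maximum of 16m (16384k).
valid_pagesizes() {
    kb=16
    while [ "$kb" -le 16384 ]; do
        if [ "$kb" -ge 1024 ]; then
            echo "$((kb / 1024))m"
        else
            echo "${kb}k"
        fi
        kb=$((kb * 4))
    done
}

valid_pagesizes
```

The loop multiplies by 4 on each pass, so it visits exactly the six sizes listed in the option description.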
-
- -data_lpage_wait [off|on]
- The normal behavior when large pages are in short supply is to
- continue running with smaller pages. If this option is set to on,
- the process instead waits until large pages become available for
- use by the data segment.
-
- -stack_pagesize n-bytes
- Stack pages will be of size n-bytes. A valid page size is 16k
- multiplied by a non-negative integer power of 4, up to a maximum
- of 16m. The valid page sizes are therefore 16k, 64k, 256k, 1m, 4m,
- and 16m.
-
- -stack_lpage_wait [off|on]
- The normal behavior when large pages are in short supply is to
- continue running with smaller pages. If this option is set to on,
- the process instead waits until large pages become available for
- use by the stack segment.
-
- -text_pagesize n-bytes
- Text pages will be of size n-bytes. A valid page size is 16k
- multiplied by a non-negative integer power of 4, up to a maximum
- of 16m. The valid page sizes are therefore 16k, 64k, 256k, 1m, 4m,
- and 16m.
-
- -text_lpage_wait [off|on]
- The normal behavior when large pages are in short supply is to
- continue running with smaller pages. If this option is set to on,
- the process instead waits until large pages become available for
- use by the text segment.
-
- -migration [off|on|threshold]
- Page migration is turned on or off. If a threshold is specified then
- page migration will be turned on and the migration threshold will be
- set in the same manner as when -migration_level is specified (see
- below).
-
- -migration_level threshold
- Page migration threshold is set to threshold. This value specifies
- the maximum percentage difference between the number of remote
- memory accesses and local memory accesses (relative to maximum
- counter values) for a given page before a migration request event
- occurs. A special argument of 0 will turn page migration off. This
- option is provided for backward compatibility only; new scripts
- should use the -migration option (see above) instead.
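
One possible reading of the threshold description, sketched as shell arithmetic. The helper name would_migrate, the counter maximum passed as an argument, and the strict greater-than comparison are all assumptions made for illustration; this page does not state the counter width or the exact comparison the system performs.

```shell
# Hypothetical helper, not part of dplace.
# Usage: would_migrate REMOTE LOCAL COUNTER_MAX THRESHOLD_PERCENT
# Succeeds when the remote/local difference, expressed as a percentage
# of the (assumed) maximum counter value, exceeds the threshold.
would_migrate() {
    diff_pct=$(( ($1 - $2) * 100 / $3 ))
    [ "$diff_pct" -gt "$4" ]
}

# Example with made-up counts: 200 remote accesses, 40 local accesses,
# an assumed counter maximum of 255, and a threshold of 50 percent.
if would_migrate 200 40 255 50; then
    echo "migration requested"
else
    echo "page stays"
fi
```

Under this reading a lower threshold makes migration more eager, because a smaller imbalance between remote and local accesses is enough to trigger a request.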
-
- -propagate
- Migration and page size information will be inherited by descendants
- which are exec'ed.
-
- -mustrun
- When threads are attached to memories or CPUs, the threads are
- attached to CPUs on the node using process_cpulink with a request
- mode of mandatory.
-
- -verbose or -v
- Detailed diagnostic information is written to standard error.
-
- EXAMPLE
- To place data according to the file placement_file for the executable
- a.out, which would normally be run by:
- % a.out < in > out
- one would instead run:
- % dplace -place placement_file a.out < in > out
-
- An example placement file placement_file, when a.out is two-threaded,
- might look like:
-
- # placement_file
- memories 2 in topology cube # set up 2 memories which are close
- threads 2 # number of threads
- run thread 0 on memory 1 # run the first thread on the 2nd memory
- run thread 1 on memory 0 # run the 2nd thread on the first memory
-
-
- This specification requests 2 nearby memories from the operating
- system. At creation, each thread is requested to run on an available CPU
- that is local to the specified memory. As data and stack space is
- touched or faulted in, physical memory is allocated from the memory that
- is local to the thread that initiated the fault.
-
- This can be written in a scalable way for a variable number of threads
- using the environment variable NP as follows:
-
- # scalable placement_file
- memories $NP in topology cube # set up memories which are close
- threads $NP # number of threads
- # run the last thread on the first memory etc.
- distribute threads $NP-1:0:-1 across memories
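
The thread range in the distribute line appears to read as start:stop:stride. The POSIX shell sketch below assumes that reading, and assumes one thread per memory as the comment in the placement file suggests, to print the resulting pairing for NP=4. The helper expand_range is hypothetical; see dplace(5) for the authoritative syntax.

```shell
# Hypothetical helper: expand START STOP STRIDE into one number per line.
expand_range() {
    i=$1
    if [ "$3" -gt 0 ]; then
        while [ "$i" -le "$2" ]; do echo "$i"; i=$((i + $3)); done
    elif [ "$3" -lt 0 ]; then
        while [ "$i" -ge "$2" ]; do echo "$i"; i=$((i + $3)); done
    else
        echo "stride must be nonzero" >&2
        return 1
    fi
}

# Pair the threads, in range order, with memories 0, 1, 2, ...
NP=4
mem=0
for t in $(expand_range $((NP - 1)) 0 -1); do
    echo "thread $t -> memory $mem"
    mem=$((mem + 1))
done
```

With NP=4 the range counts down from 3 to 0, so the last thread is paired with the first memory, which matches the comment in the placement file.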
-
-
-
- USING MPI
- Most MPI implementations use $MPI_NP+1 threads, where the first thread
- is mainly inactive, so one might use the placement file:
-
- # scalable placement_file for MPI
- memories ($MPI_NP + 1)/2 in topology cube # set up memories which are close
- threads $MPI_NP + 1 # number of threads
- # ignore the lazy thread
- distribute threads 1:$MPI_NP across memories
-
-
- When using MPI with dplace, syntax similar to the following should be
- used:
- mpirun -np <number_of_processes> dplace <dplace_args> a.out
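
To see what the expressions in the MPI placement file evaluate to, the arithmetic can be reproduced in POSIX shell. This is a sketch only: the helper mpi_layout is hypothetical, and an odd MPI_NP is used so that the result does not depend on how dplace rounds the division, which this page does not specify.

```shell
# Hypothetical helper: show the values the MPI placement file would use
# for a given number of MPI processes.
mpi_layout() {
    np=$1
    memories=$(( (np + 1) / 2 ))   # two threads per memory
    threads=$(( np + 1 ))          # MPI processes plus the mostly idle first thread
    echo "memories=$memories threads=$threads placed=1:$np"
}

mpi_layout 7
```

For 7 MPI processes this gives 4 memories and 8 threads, of which threads 1 through 7 are distributed and thread 0 is left unplaced.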
-
-
- LARGE PAGES
- Some applications run more efficiently using large pages. To run a
- program a.out utilizing 64k pages for both stack and data, a placement
- file is not necessary. One need only invoke the command:
- dplace -data_pagesize 64k -stack_pagesize 64k a.out
- from the shell.
-
-
- PHYSICAL PLACEMENT
- Physical placement can also be accomplished using dplace. The following
- placement file:
-
-
- # physical placement_file for 3 specific memories and 6 threads
- memories 3 in topology physical near \
- /hw/module/2/slot/n4/node \
- /hw/module/3/slot/n2/node \
- /hw/module/4/slot/n3/node
- threads 6
- #the first two threads (0 & 1 ) will run on /hw/module/2/slot/n4/node
- #the second two threads (2 & 3 ) will run on /hw/module/3/slot/n2/node
- #the last two threads (4 & 5 ) will run on /hw/module/4/slot/n3/node
- distribute threads across memories
-
-
- specifies three physical nodes using the proper /hw path. To find out the
- names of the memory nodes on the machine you are using, type "find /hw
- -name node -print" at the shell command prompt.
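
The comments in the physical placement file describe a block distribution: consecutive threads are grouped onto each memory in turn. The POSIX shell sketch below reproduces that mapping. The helper block_map is hypothetical, and it assumes the thread count divides evenly by the number of memories.

```shell
# Hypothetical helper: block_map THREADS NODE...
# Prints which node each thread lands on when threads are split into
# equal consecutive blocks, one block per node.
block_map() {
    threads=$1
    shift
    per=$((threads / $#))
    t=0
    for node in "$@"; do
        n=0
        while [ "$n" -lt "$per" ]; do
            echo "thread $t -> $node"
            t=$((t + 1))
            n=$((n + 1))
        done
    done
}

block_map 6 /hw/module/2/slot/n4/node \
            /hw/module/3/slot/n2/node \
            /hw/module/4/slot/n3/node
```

The output pairs threads 0 and 1 with the first node, 2 and 3 with the second, and 4 and 5 with the third, as the placement file comments state.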
-
-
- MUSTRUN
- The mustrun option binds a thread to a particular CPU on the node.
- In cases where threads are distributed across memories, the CPU selection
- attempts to schedule CPUs so as to maximize CPU-to-memory bandwidth.
- For example, on an Origin 3000, CPUs sharing a memory will be scheduled
- on the two separate processor buses available on the node, which serves to
- increase available memory bandwidth for each processor. The following
- placement file demonstrates this behaviour:
-
-
- # placement file for 4 memories and 8 threads
- threads 8
- memories 4
- distribute threads across memories
-
-
- When this placement file is run with the -mustrun option, the CPU
- selection on each memory of an Origin 3000 attempts to select CPU
- numbers 0 and 2, or 1 and 3, or 0 and 3, or 1 and 2. The selection
- avoids pairing 0 and 1, or 2 and 3, as these CPUs share a processor
- interface bus. If the processor selection cannot be done optimally as
- described, the next available CPU is selected regardless of bus
- attachment.
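
The preferred pairings can be enumerated mechanically. The POSIX shell sketch below assumes a four-CPU node in which CPUs 0 and 1 share one bus and CPUs 2 and 3 share the other, as described above, and models the bus number as the CPU number divided by 2. The helper preferred_pairs is hypothetical.

```shell
# Hypothetical helper: list CPU pairs on a 4-CPU node whose members sit
# on different processor buses (bus = cpu / 2, an assumption based on
# the pairing rules described for the Origin 3000).
preferred_pairs() {
    for a in 0 1 2 3; do
        for b in 0 1 2 3; do
            [ "$a" -lt "$b" ] || continue
            if [ $((a / 2)) -ne $((b / 2)) ]; then
                echo "$a and $b"
            fi
        done
    done
}

preferred_pairs
```

The pairs 0 and 1, and 2 and 3, never appear in the output because both members map to the same bus.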
-
-
-
- DEFAULTS
- If command line arguments are omitted, dplace chooses the following set
- of defaults:
-
- place /dev/null
- data_pagesize 16k
- stack_pagesize 16k
- text_pagesize 16k
- migration off
- propagate off
- mustrun off
- verbose off
-
-
- RESTRICTIONS
- Programs must be dynamic executables; the behavior of non-shared
- executables is unaffected by dplace. Placement files only affect direct
- descendants of dplace. Parallel applications must be based on the
- sproc(2) or fork(2) mechanism. Page sizes for regions which are not
- stack, text, or data cannot be specified with dplace (e.g., SYSV shared
- memory). Regions shared by multiple processes (e.g., DSO text) are faulted
- in with the pagesize settings of the faulting process. Dplace sets the
- environment variable _DSM_OFF, which disables libmp's own DSM
- directives and environment variables.
-
-
- ENVIRONMENT
- Dplace recognizes and uses the environment variables PAGESIZE_DATA,
- PAGESIZE_STACK and PAGESIZE_TEXT. When using these variables it is
- important to note that the units are kilobytes. A command line
- option overrides the corresponding environment variable setting.
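
Because the environment variables are in kilobytes while the command line options take sizes such as 64k or 1m, a small conversion helps avoid mistakes. The POSIX shell sketch below is illustrative only: the helper to_kilobytes is hypothetical, 1m is assumed to mean 1024 kilobytes, and the assumption that the variables hold a bare number with no suffix is inferred from the note about units rather than stated on this page.

```shell
# Hypothetical helper: convert a size such as 64k or 1m to kilobytes.
to_kilobytes() {
    case $1 in
        *k) echo "${1%k}" ;;
        *m) echo $(( ${1%m} * 1024 )) ;;
        *)  echo "unrecognized size: $1" >&2; return 1 ;;
    esac
}

# Assumed usage: set the variables before running the program under dplace.
PAGESIZE_DATA=$(to_kilobytes 64k)
PAGESIZE_STACK=$(to_kilobytes 1m)
export PAGESIZE_DATA PAGESIZE_STACK
echo "PAGESIZE_DATA=$PAGESIZE_DATA PAGESIZE_STACK=$PAGESIZE_STACK"
```

Any -data_pagesize or -stack_pagesize given on the command line would still take precedence over these settings.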
-
-
- ERRORS
- If errors are encountered in the placement_file, the default procedure
- for dplace is to print a diagnostic message to standard error specifying
- where the error occurred in the placement_file and abort execution. If
- errors are encountered in the libdplace.so library during the run-time
- execution of program, then a diagnostic message is sent to standard
- error, a default signal of SIGKILL is sent to all members of the process
- group, and execution is aborted.
-
- The mode signal expr statement allows a specific signal number to be
- generated upon error. If mode signal expr is specified, the action
- taken when libdplace.so detects a run-time error is to send the signal
- number derived from expr to the program invoked by dplace. Under this
- condition control is returned to the caller, which is the program.
- The signal number can range from 1 to 32.
-
- An example of how to set the signal number:
-
- mode signal 16
-
- Upon detecting an error in libdplace.so during run-time, signal 16
- (defined as SIGUSR1) is sent to the calling process (in this case the
- program) and control is returned to the caller.
-
-
- SEE ALSO
- dplace(3), dplace(5), dprof(1), numa(5), mmci(5), dlook(1), numa_view(1),
- mld(3).